Perpetuity property of the Dirichlet distribution

Pawel Hitczenko, Gerard Letac



Abstract 



Let $X$, $B$ and $Y$ be three independent Dirichlet, Bernoulli and beta random variables such that $X \sim \mathcal{D}(a_0, \ldots, a_d)$, such that $\Pr(B = (0, \ldots, 0, 1, 0, \ldots, 0)) = a_i/a$ (with the $1$ in position $i$) where $a = \sum_{i=0}^d a_i$, and such that $Y \sim \beta(1, a)$. We prove that $X \sim X(1 - Y) + BY$. This gives the stationary distribution of a simple Markov chain on a tetrahedron. We also extend this result to the case when $B$ follows a quasi Bernoulli distribution $\mathcal{B}_k(a_0, \ldots, a_d)$ on the tetrahedron and when $Y \sim \beta(k, a)$. We extend it even more generally to the case where $X$ is a Dirichlet process and $B$ is a quasi Bernoulli random probability. Finally the case where the integer $k$ is replaced by a positive number $c$ is considered when $a_0 = \ldots = a_d = 1$.

Keywords: Perpetuities, Dirichlet process, Ewens distribution, quasi Bernoulli laws, probabilities on a tetrahedron, $T_c$ transform, stationary distribution. AMS classification: 60J05, 60E99.

1 Introduction

In a recent paper [1], Ambrus, Kevei and Vigh make the following interesting observation: if $V$, $Y$, $W$ are independent random variables such that $V \sim \frac{1}{\pi}\left(\frac{1}{4} - v^2\right)^{-1/2} 1_{(-1/2,1/2)}(v)\,dv$, $Y$ is uniform on $(0, 1)$ and $\Pr(W = 1) = \Pr(W = -1) = 1/2$, then

$$V \sim V(1 - Y) + \frac{W}{2}\,Y.$$

The law $\mu$ of a random variable $V$ satisfying $V \sim VM + Q$, where the pair $(M, Q)$ is independent of $V$ on the right-hand side, is often called a perpetuity generated by the law $\nu$ of $(M, Q)$. Thus, another way of stating the observation from [1] is that an arcsine random variable on $(-1/2, 1/2)$ is a perpetuity generated by the distribution of $(M, Q) \sim (1 - Y, WY/2)$. Part of the reason we found it interesting is that there are relatively few examples of exact solutions to perpetuity equations in the literature. We will generalize this result, and our generalization will provide more examples of explicit generation of perpetuities, including power semicircle distributions (see [2]). Note that the same perpetuity can be generated by different $\nu$. See Section 6 below for a reminder of the classical link between perpetuities and the stationary distributions of the Markov chains obtained by iteration of random affine maps.

To carry out the generalization, we reformulate on $(0, 1)$ the result in [1] by writing $X = V + \frac{1}{2}$ and $B = (1 + W)/2$. Clearly $X, Y, B$ are independent with $X \sim \beta(1/2, 1/2)$, $B \sim \frac{1}{2}(\delta_0 + \delta_1)$ and $X \sim X(1 - Y) + BY$. In Theorem 1.1 below, we give a generalization of the result in [1] expressed in terms of $X$, $Y$ and $B$. We will need the following notation. The natural basis of $\mathbb{R}^{d+1}$ is denoted by $e_0, \ldots, e_d$. The convex hull of $\{e_0, \ldots, e_d\}$ is a tetrahedron that we denote by $E_{d+1}$. The elements of $E_{d+1}$ are therefore the vectors $\lambda = (\lambda_0, \ldots, \lambda_d)$ of $\mathbb{R}^{d+1}$ such that $\lambda_i \geq 0$ for $i = 0, \ldots, d$ and such that $\lambda_0 + \cdots + \lambda_d = 1$.

If $p_0, \ldots, p_d$ are positive numbers with sum equal to one, the distribution $\sum_{i=0}^d p_i \delta_{e_i}$ of $B = (B_0, \ldots, B_d) \in E_{d+1}$ is called a Bernoulli distribution. By definition $B$ satisfies $\Pr(B = e_i) = p_i$. If $a_0, \ldots, a_d$ are positive numbers, the Dirichlet distribution $\mathcal{D}(a_0, \ldots, a_d)$ of $X = (X_0, \ldots, X_d) \in E_{d+1}$ is such that the law of $(X_1, \ldots, X_d)$ is

$$\frac{1}{B(a_0, \ldots, a_d)}\,(1 - x_1 - \cdots - x_d)^{a_0 - 1} x_1^{a_1 - 1} \cdots x_d^{a_d - 1}\, 1_{T_d}(x_1, \ldots, x_d)\, dx_1 \ldots dx_d$$

where $B(a_0, \ldots, a_d) = \frac{\Gamma(a_0) \cdots \Gamma(a_d)}{\Gamma(a_0 + \cdots + a_d)}$ and where $T_d$ is the set of $(x_1, \ldots, x_d)$ such that $x_i > 0$ for all $i = 0, 1, \ldots, d$, with the convention $x_0 = 1 - x_1 - \cdots - x_d$. For instance, if the real random variable $X_1$ follows the beta distribution

$$\beta(a_1, a_0)(dx) = \frac{1}{B(a_1, a_0)}\, x^{a_1 - 1}(1 - x)^{a_0 - 1}\, 1_{(0,1)}(x)\, dx$$

then $(1 - X_1, X_1) \sim \mathcal{D}(a_0, a_1)$.

Theorem 1.1: Let $a_0, \ldots, a_d$ be positive numbers. Denote $a = a_0 + \cdots + a_d$. Let $X$, $Y$ and $B$ be three independent Dirichlet, beta and Bernoulli random variables such that $X \sim \mathcal{D}(a_0, \ldots, a_d)$ and $B \sim \sum_{i=0}^d \frac{a_i}{a}\delta_{e_i}$ are valued in $\mathbb{R}^{d+1}$ and such that $Y \sim \beta(1, a)$. Then $X \sim X(1 - Y) + BY$.

Comments. Considering each coordinate, Theorem 1.1 says that for all $i = 0, \ldots, d$ we have $X_i \sim X_i(1 - Y) + B_i Y$. Since $1 = \sum_{i=0}^d X_i = \sum_{i=0}^d B_i$, the statement for $i = 0$ is true if it is verified for $i = 1, \ldots, d$. For instance, for $d = 1$ a reformulation of Theorem 1.1 is

Theorem 1.2: Let $a_0, a_1 > 0$. Let $X_1, Y, B_1$ be three independent random variables such that $X_1 \sim \beta(a_1, a_0)$, $Y \sim \beta(1, a_0 + a_1)$, $B_1 \sim \frac{a_0}{a_0 + a_1}\delta_0 + \frac{a_1}{a_0 + a_1}\delta_1$. Then $X_1 \sim X_1(1 - Y) + B_1 Y$.

Therefore the initial remark contained in [1] is a particular case of Theorem 1.1 for $d = 1$ and $a_0 = a_1 = 1/2$. More generally, the case $d = 1$ and $a_0 = a_1$ covers the power semicircle distributions discussed in [2] (with $\theta = a_0 - 3/2$). In particular, $a_0 = a_1 = 3/2$ is the classical semicircle distribution. Theorem 1.1 is proved in Section 2. On the other hand, we shall see Theorem 1.1 as a particular case of the more general Theorem 4.1 in Section 4 below. Stating Theorem 4.1 needs the introduction of a new distribution $\mathcal{B}_k(a_0, \ldots, a_d)$ on the tetrahedron $E_{d+1}$. We call it a quasi Bernoulli distribution of order $k$. It is concentrated on the faces of order less than $k$ in a way that we will make reasonably explicit in Section 3. We add here a family of laws with interesting properties to the zoo of distributions on a tetrahedron.

Finally, one can prove Theorem 1.2 by showing $\mathbb{E}((X_1(1 - Y) + B_1 Y)^n) = \mathbb{E}(X_1^n)$ for all integers $n$. Our proof of Theorem 4.1 is somewhat linked to this method of moments. For this proof, we need to introduce the $T_c$ transform of a distribution on the tetrahedron $E_{d+1}$. This is done in Section 2. We will prove several important properties of the $T_c$ transform in Theorem 2.1. Theorem 5.1 extends Theorem 4.1 to random probability measures on an abstract space $\Omega$, where the Dirichlet distribution is replaced by the so called Dirichlet process governed by the positive measure $\alpha$ on $\Omega$. Surprisingly, this construction of Section 5 uses the Ewens distribution. Section 6 gives a standard application of the preceding results to certain Markov chains on $E_{d+1}$.
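The method of moments mentioned for Theorem 1.2 can be illustrated with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the helper names `poch`, `moment_x` and `moment_perpetuity` are hypothetical, and the computation uses the standard beta moments $\mathbb{E}(X_1^n) = (a_1)_n/(a_0+a_1)_n$ together with the fact that $1 - Y \sim \beta(a, 1)$ when $Y \sim \beta(1, a)$.

```python
from fractions import Fraction
from math import comb


def poch(c, n):
    """Pochhammer symbol (c)_n = c (c + 1) ... (c + n - 1), with (c)_0 = 1."""
    out = Fraction(1)
    for i in range(n):
        out *= c + i
    return out


def moment_x(a0, a1, n):
    """E(X_1^n) for X_1 ~ beta(a1, a0)."""
    return poch(a1, n) / poch(a0 + a1, n)


def moment_perpetuity(a0, a1, n):
    """E((X_1 (1 - Y) + B_1 Y)^n), obtained by conditioning on B_1."""
    a = a0 + a1

    def moment_one_minus_y(j):
        # 1 - Y ~ beta(a, 1), hence E((1 - Y)^j) = a / (a + j).
        return Fraction(a) / (a + j)

    # B_1 = 0: the variable reduces to X_1 (1 - Y).
    m0 = moment_x(a0, a1, n) * moment_one_minus_y(n)
    # B_1 = 1: the variable is 1 - (1 - X_1)(1 - Y); expand with the
    # binomial formula, using 1 - X_1 ~ beta(a0, a1).
    m1 = sum(
        comb(n, j) * (-1) ** j * poch(a0, j) / poch(a, j) * moment_one_minus_y(j)
        for j in range(n + 1)
    )
    return Fraction(a0) / a * m0 + Fraction(a1) / a * m1
```

For instance, with `a0 = a1 = Fraction(1, 2)` (the arcsine case of [1]) the two functions agree for every `n` that is tried, as Theorem 1.2 predicts; this is a finite check, not a proof.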

2 The Tc transform of a distribution on the tetrahedron

In the sequel, if $f = (f_0, \ldots, f_d)$ and $x = (x_0, \ldots, x_d)$ are in $\mathbb{R}^{d+1}$ we write $\langle f, x\rangle = \sum_{i=0}^d f_i x_i$ and we denote

$$U_{d+1} = \{f = (f_0, \ldots, f_d) \in \mathbb{R}^{d+1}\ ;\ f_0 > 0, \ldots, f_d > 0\}.$$

Let $X = (X_0, \ldots, X_d)$ be a random variable on $E_{d+1}$. Let $c > 0$. The $T_c$ transform of $X$ is the following function on $U_{d+1}$:

$$T_c(X)(f) = \mathbb{E}(\langle f, X\rangle^{-c}).$$

Its existence is clear from $T_c(X)(f) \leq (\min_i f_i)^{-c} < \infty$. It satisfies $T_c(X)(\lambda f) = \lambda^{-c} T_c(X)(f)$. The explicit calculation of $T_c(X)$ is easy in some rare cases, including the Dirichlet case $\mathcal{D}(a_0, \ldots, a_d)$ when $c = a = a_0 + \cdots + a_d$ and the Bernoulli case $\sum_{i=0}^d p_i \delta_{e_i}$. In some sense, the present paper originated from an effort to compute $T_c(X)$ when $X \sim \mathcal{D}(a_0, \ldots, a_d)$ and $c = a + k$ where $k$ is a positive integer. For $d = 1$, knowing the $T_c$ transform is equivalent to knowing the function $t \mapsto \mathbb{E}((1 - tX)^{-c})$ on $(-\infty, 1)$ when $X$ is a random variable valued in $[0, 1]$ since

$$T_c((1 - X, X))(1, 1 - t) = \mathbb{E}((1 - tX)^{-c}).$$

The $T_c$ transform is a tool which is in general better adapted to the study of distributions on the tetrahedron than the Laplace transform $\mathbb{E}(\exp(-\langle f, X\rangle))$. The next theorem gathers its main properties. It shows for instance that $T_c(X)$ characterizes the distribution of $X$ and gives a crucial probabilistic interpretation to the product $T_a(X_0)T_b(X_1)$ when $X_0$ and $X_1$ are independent random variables valued in $E_{d+1}$.

Theorem 2.1:

1. If $X$ and $Z$ are random variables on $E_{d+1}$ and if there exists $c > 0$ such that $T_c(X)(f) = T_c(Z)(f)$ for all $f \in U_{d+1}$ then $X \sim Z$.

2. If $k$ is a non-negative integer and if $H = -\left(\frac{\partial}{\partial f_0} + \cdots + \frac{\partial}{\partial f_d}\right)$ then

$$H^k T_c(X) = (c)_k\, T_{c+k}(X), \qquad (1)$$

where $(c)_n$ is the Pochhammer symbol defined by $(c)_0 = 1$ and $(c)_{n+1} = (c)_n (c + n)$.

3. If $(a_0, \ldots, a_d) \in U_{d+1}$ with $a = a_0 + \cdots + a_d$ and if $X \sim \mathcal{D}(a_0, \ldots, a_d)$ then

$$T_a(X)(f) = f_0^{-a_0} \cdots f_d^{-a_d}. \qquad (2)$$

4. Suppose that $X_0, \ldots, X_r, Y$ are independent random variables such that $X_i \in E_{d+1}$ for $i = 0, \ldots, r$ and $Y = (Y_0, \ldots, Y_r) \in E_{r+1}$ has Dirichlet distribution $\mathcal{D}(b_0, \ldots, b_r)$. Then for $b = b_0 + \cdots + b_r$ and for $Z = X_0 Y_0 + \cdots + X_r Y_r$ we have on $U_{d+1}$:

$$T_b(Z)(f) = T_{b_0}(X_0)(f) \cdots T_{b_r}(X_r)(f). \qquad (3)$$

In particular, if $Y \sim \beta(b_1, b_0)$ we have

$$T_{b_0 + b_1}((1 - Y)X_0 + YX_1) = T_{b_0}(X_0)\, T_{b_1}(X_1). \qquad (4)$$

5. The probability of the face $x_0 = \cdots = x_k = 0$ is computable by the $T_c$ transform:

$$\lim_{f_0 \to \infty} T_c(X)(f_0, \ldots, f_0, 1, 1, \ldots, 1) = \Pr(X_0 = X_1 = \cdots = X_k = 0).$$

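Proof: For part 1) fix $g \in \mathbb{R}^{d+1}$, set $f_i = 1 - t g_i$ for $t$ small enough and expand $t \mapsto \mathbb{E}(\langle f, X\rangle^{-c})$ in a neighborhood of $t = 0$. Since $\langle f, X\rangle = 1 - t\langle g, X\rangle$ we have

$$T_c(X)(f) = \mathbb{E}((1 - t\langle g, X\rangle)^{-c}) = \sum_{n=0}^{\infty} \frac{(c)_n}{n!}\, \mathbb{E}(\langle g, X\rangle^n)\, t^n.$$

It follows from the hypothesis $T_c(X) = T_c(Z)$ that $\mathbb{E}(\langle g, X\rangle^n) = \mathbb{E}(\langle g, Z\rangle^n)$ for all $n$. Thus $\langle g, X\rangle \sim \langle g, Z\rangle$ since both are bounded random variables with the same moments. Since this is true for all $g \in \mathbb{R}^{d+1}$ we have $X \sim Z$. Formula (1) is easy to obtain by induction on $k$ using the fact that $X_0 + \cdots + X_d = 1$. Let us give a proof of the standard formula (2) by the so called beta-gamma algebra. It differs from the method of Proposition 2.1 in [4]. We write $\gamma_c(dv) = e^{-v} v^{c-1} 1_{(0,\infty)}(v)\,dv/\Gamma(c)$. Consider independent $V_0, \ldots, V_d$ such that $V_i \sim \gamma_{a_i}$ and define $V = V_0 + \cdots + V_d$ and $X_i = V_i/V$ for all $i = 0, \ldots, d$. Recall that $(X_0, \ldots, X_d) \sim \mathcal{D}(a_0, \ldots, a_d)$ is independent of $V \sim \gamma_a$. Therefore, for $f \in U_{d+1}$,

$$\prod_{i=0}^d (1 + f_i)^{-a_i} = \mathbb{E}\left(e^{-f_0 V_0 - \cdots - f_d V_d}\right) = \mathbb{E}\left(e^{-V\langle f, X\rangle}\right) = \mathbb{E}\left((1 + \langle f, X\rangle)^{-a}\right) = T_a(X)(1 + f_0, \ldots, 1 + f_d),$$

and (2) extends to all of $U_{d+1}$ by the homogeneity $T_a(X)(\lambda f) = \lambda^{-a} T_a(X)(f)$.

Formula (3) follows from (2) by replacing $X, a_0, \ldots, a_d$ by $Y, b_0, \ldots, b_r$ and $f$ by $(\langle f, X_0\rangle, \ldots, \langle f, X_r\rangle)$. Using conditioning and the independence of $X_0, \ldots, X_r$ we obtain

$$T_b(Z)(f) = \mathbb{E}\left(\mathbb{E}\left(\Big[\sum_{j=0}^r Y_j \langle f, X_j\rangle\Big]^{-b}\ \Big|\ X_0, \ldots, X_r\right)\right) = \mathbb{E}\left(\prod_{j=0}^r \langle f, X_j\rangle^{-b_j}\right) = \prod_{j=0}^r T_{b_j}(X_j)(f).$$

Applying (3) to $(Y_0, Y_1) = (1 - Y, Y) \sim \mathcal{D}(b_0, b_1)$ we get $Z = (1 - Y)X_0 + YX_1$. This leads to (4). Property 5 is obvious since the events $X_0 + \cdots + X_k = 0$ and $X_0 = X_1 = \cdots = X_k = 0$ coincide. □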
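Formula (2) lends itself to a quick numerical sanity check through the beta-gamma representation used in the proof. The following is a minimal Monte Carlo sketch, not part of the paper: the function names, the sample size and the seed are arbitrary assumptions, and the Dirichlet vector is drawn as normalised independent gamma variables.

```python
import random
from math import prod


def sample_dirichlet(params, rng):
    """Draw X ~ D(a_0, ..., a_d) as normalised independent gamma variables."""
    v = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(v)
    return [vi / total for vi in v]


def t_transform_mc(params, f, c, n_samples=100_000, seed=0):
    """Monte Carlo estimate of T_c(X)(f) = E(<f, X>^{-c}) for X ~ D(params)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        x = sample_dirichlet(params, rng)
        acc += sum(fi * xi for fi, xi in zip(f, x)) ** (-c)
    return acc / n_samples


def t_transform_exact(params, f):
    """Right-hand side of (2); only valid for c = a_0 + ... + a_d."""
    return prod(fi ** (-ai) for fi, ai in zip(f, params))
```

Calling `t_transform_mc(params, f, sum(params))` should land close to `t_transform_exact(params, f)`; for any other value of `c` no such closed form is claimed.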

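Proof of Theorem 1.1. We prove Theorem 1.1 by taking $X_0 = X$, $X_1 = B$, $b_1 = 1$ and $b_0 = a$ in (4). We have

$$T_1(B)(f) = \frac{1}{a}\left(\frac{a_0}{f_0} + \cdots + \frac{a_d}{f_d}\right).$$

The trick for computing $T_{1+a}(X)$ is to observe from (1) and (2) that

$$a\, T_{1+a}(X)(f) = H\, T_a(X)(f) = \left(\sum_{i=0}^d \frac{a_i}{f_i}\right) \prod_{i=0}^d f_i^{-a_i} = a\, T_1(B)(f)\, T_a(X)(f).$$

From (4) we also know that for $Z = (1 - Y)X + YB$ we have $T_{1+a}(Z) = T_a(X)\, T_1(B)$. Thus $T_{1+a}(Z) = T_{1+a}(X)$. Part 1 of Theorem 2.1 implies $X \sim Z$. □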



3 The quasi Bernoulli distributions on a tetrahedron 

First, we slightly extend the definition of a Dirichlet distribution $\mathcal{D}(a_0, \ldots, a_d)$ by allowing $a_i \geq 0$ instead of $a_i > 0$ while keeping $a = \sum_{i=0}^d a_i > 0$. For such a sequence $(a_0, \ldots, a_d)$ we define the nonempty set $T = \{i\ ;\ a_i > 0\}$. We say that $\mathcal{D}(a_0, \ldots, a_d)$ is the Dirichlet distribution concentrated on the tetrahedron $E_T$ generated by $(e_i)_{i \in T}$ with parameters $(a_i)_{i \in T}$. If $X \sim \mathcal{D}(a_0, \ldots, a_d)$ the formula $\mathbb{E}(\langle f, X\rangle^{-a}) = \prod_{i=0}^d f_i^{-a_i}$ still holds. If $T$ contains only one element $i_0$ then $\mathcal{D}(a_0, \ldots, a_d)$ is simply $\delta_{e_{i_0}}$ and does not depend on $a$. Now recall a simple combinatorial formula where $k$ is a positive integer and $a = a_0 + \cdots + a_d$:

$$\sum_{(b_0, \ldots, b_d) \in \mathbb{N}^{d+1};\ b_0 + \cdots + b_d = k}\ \prod_{i=0}^d \frac{(a_i)_{b_i}}{b_i!} = \frac{(a)_k}{k!}. \qquad (5)$$

The proof is immediate if we use generating functions: expand $\prod_{i=0}^d (1 - t)^{-a_i} = (1 - t)^{-a}$ in a power series on both sides. We now define our new distributions.
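Formula (5) can be checked exactly for small cases. The sketch below is an illustration only; the helper names `poch`, `compositions`, `lhs_formula_5` and `rhs_formula_5` are hypothetical, and rational parameters are used so that both sides can be compared without rounding.

```python
from fractions import Fraction
from math import factorial, prod


def poch(c, n):
    """Pochhammer symbol (c)_n, with (c)_0 = 1."""
    out = Fraction(1)
    for i in range(n):
        out *= c + i
    return out


def compositions(k, parts):
    """Yield all tuples of `parts` non-negative integers summing to k."""
    if parts == 1:
        yield (k,)
        return
    for first in range(k + 1):
        for rest in compositions(k - first, parts - 1):
            yield (first,) + rest


def lhs_formula_5(a, k):
    """Sum over b_0 + ... + b_d = k of prod_i (a_i)_{b_i} / b_i!."""
    return sum(
        prod(poch(ai, bi) / factorial(bi) for ai, bi in zip(a, b))
        for b in compositions(k, len(a))
    )


def rhs_formula_5(a, k):
    """(a)_k / k! with a = a_0 + ... + a_d."""
    return poch(sum(a), k) / factorial(k)
```

The summands of `lhs_formula_5`, multiplied by $k!/(a)_k$, are the mixing weights that appear in the definition of the quasi Bernoulli distribution, so the same enumeration shows that these weights add up to one.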

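Let $a_0, \ldots, a_d > 0$ and $a = a_0 + \cdots + a_d$ and let $k$ be a positive integer. The quasi Bernoulli distribution of order $k$ is the distribution on the tetrahedron $E_{d+1}$ defined as the mixing of Dirichlet distributions

$$\mathcal{B}_k(a_0, \ldots, a_d) = \frac{k!}{(a)_k} \sum_{(b_0, \ldots, b_d) \in \mathbb{N}^{d+1};\ b_0 + \cdots + b_d = k}\ \prod_{i=0}^d \frac{(a_i)_{b_i}}{b_i!}\ \mathcal{D}(b_0, \ldots, b_d). \qquad (6)$$

Formula (5) shows that (6) is indeed a probability on $E_{d+1}$. Setting $c = k$ in Theorem 4.3 below gives an explicit form of $\mathcal{B}_k(a_0, \ldots, a_d)$ in the particular case $a_0 = \ldots = a_d = 1$. For the sake of simplicity of the next statement denote

$$\sigma_j(f) = \sum_{i=0}^d \frac{a_i}{f_i^{\,j}}. \qquad (7)$$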



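Proposition 3.1: If $B \sim \mathcal{B}_k(a_0, \ldots, a_d)$ then

$$T_k(B)(f) = \frac{k!}{(a)_k} \sum_{(b_0, \ldots, b_d) \in \mathbb{N}^{d+1};\ b_0 + \cdots + b_d = k}\ \prod_{i=0}^d \frac{(a_i)_{b_i}}{b_i!\, f_i^{\,b_i}} \qquad (8)$$

$$\phantom{T_k(B)(f)} = \frac{k!}{(a)_k} \sum_{(m_1, \ldots, m_k) \in \mathbb{N}^k;\ m_1 + 2m_2 + \cdots + k m_k = k}\ \prod_{j=1}^k \frac{\sigma_j(f)^{m_j}}{j^{m_j}\, m_j!}. \qquad (9)$$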



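Proof. Formula (8) is obvious from the definition of $\mathcal{B}_k(a_0, \ldots, a_d)$ and formula (2). To prove (9) denote by $\frac{k!}{(a)_k} B_k$ and by $\frac{k!}{(a)_k} C_k$ the right-hand sides of (8) and (9) respectively. Now

$$\sum_{k=0}^{\infty} C_k t^k = \sum_{m_1, m_2, \ldots}\ \prod_{j \geq 1} \frac{1}{m_j!}\left(\frac{\sigma_j t^j}{j}\right)^{m_j} = \exp\left(\sum_{j \geq 1} \frac{\sigma_j t^j}{j}\right).$$

Similarly one computes $\sum_{k=0}^{\infty} B_k t^k = \prod_{i=0}^d (1 - t/f_i)^{-a_i} = \exp\left(\sum_{j \geq 1} \sigma_j t^j/j\right)$, leading to $B_k = C_k$ and (9). □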
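The equality of the two expressions (8) and (9) for $T_k(B)(f)$ can be confirmed exactly on small examples. This is a hedged sketch: the function names are hypothetical, `cycle_types` simply enumerates the index set $\{m \in \mathbb{N}^k : m_1 + 2m_2 + \cdots + km_k = k\}$, and rational inputs are assumed so that the comparison is exact.

```python
from fractions import Fraction
from math import factorial, prod


def poch(c, n):
    """Pochhammer symbol (c)_n, with (c)_0 = 1."""
    out = Fraction(1)
    for i in range(n):
        out *= c + i
    return out


def compositions(k, parts):
    """Yield all tuples of `parts` non-negative integers summing to k."""
    if parts == 1:
        yield (k,)
        return
    for first in range(k + 1):
        for rest in compositions(k - first, parts - 1):
            yield (first,) + rest


def cycle_types(k):
    """Yield (m_1, ..., m_k) in N^k with m_1 + 2 m_2 + ... + k m_k = k."""
    def rec(j, remaining):
        if j > k:
            if remaining == 0:
                yield ()
            return
        for mj in range(remaining // j + 1):
            for rest in rec(j + 1, remaining - j * mj):
                yield (mj,) + rest
    yield from rec(1, k)


def t_k_via_8(a, f, k):
    """T_k(B)(f) computed from formula (8)."""
    total = sum(
        prod(poch(ai, bi) / (factorial(bi) * fi ** bi)
             for ai, bi, fi in zip(a, b, f))
        for b in compositions(k, len(a))
    )
    return factorial(k) * total / poch(sum(a), k)


def t_k_via_9(a, f, k):
    """T_k(B)(f) computed from formula (9), with sigma_j(f) as in (7)."""
    sigma = {j: sum(ai / fi ** j for ai, fi in zip(a, f)) for j in range(1, k + 1)}
    total = sum(
        prod(sigma[j] ** mj / (j ** mj * factorial(mj))
             for j, mj in enumerate(m, start=1))
        for m in cycle_types(k)
    )
    return factorial(k) * total / poch(sum(a), k)
```

For $k = 1$ both functions reduce to $\sigma_1(f)/a$, which is the transform $T_1(B)$ of the Bernoulli distribution used in the proof of Theorem 1.1.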

The remainder of this section is made of several remarks on $\mathcal{B}_k(a_0, \ldots, a_d)$. Section 4 contains further information about it. If $T \subset \{0, \ldots, d\}$ denote by $F_T$ the relative interior of $E_T$. This set is sometimes called a face of $E_{d+1}$. It is equal to the relative interior of $E_{d+1}$ if $T = \{0, \ldots, d\}$ and the family of $F_T$'s is a partition of $E_{d+1}$. Therefore $\mathcal{B}_k(a_0, \ldots, a_d)$ is a mixing of distributions on the faces $F_T$ which have densities $h_{k,T}$ with respect to the uniform distribution $\lambda_T$ on $F_T$. Here $\lambda_T = \mathcal{D}(b_0, \ldots, b_d)$ where $b_i = 1$ if $i \in T$ and $b_i = 0$ if not. Observe that if $k < d$ only faces of dimension less than $k$ are charged by $\mathcal{B}_k(a_0, \ldots, a_d)$. To be more specific, denote $a_T = \sum_{i \in T} a_i$.

When restricted to $(a_i)_{i \in T}$ formula (5) becomes

$$\sum_{(b_i)_{i \in T};\ \sum_{i \in T} b_i = k}\ \prod_{i \in T} \frac{(a_i)_{b_i}}{b_i!} = \frac{(a_T)_k}{k!}.$$

A probabilistic interpretation of this is

$$\mathcal{B}_k(a_0, \ldots, a_d)\left(\bigcup_{S \subset T} F_S\right) = \frac{(a_T)_k}{(a)_k}.$$



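Since the $F_S$ are disjoint, for $S \subset \{0, \ldots, d\}$ the weights $w_S = \mathcal{B}_k(a_0, \ldots, a_d)(F_S)$ satisfy $\sum_{S \subset T} w_S = (a_T)_k/(a)_k$. The principle of inclusion-exclusion therefore implies that $w_T = \frac{1}{(a)_k} \sum_{S \subset T} (-1)^{|T \setminus S|} (a_S)_k$. Let us introduce the symmetric polynomial

$$P_k(a_0, \ldots, a_d) = \sum_{S \subset \{0, \ldots, d\}} (-1)^{d + 1 - |S|}\, (a_S)_k.$$

Its explicit calculation is not easy. With the convention $P_0 = 1$, here is a generating function:

$$\sum_{k=0}^{\infty} P_k(a_0, \ldots, a_d) \frac{t^k}{k!} = \sum_{S \subset \{0, \ldots, d\}} (-1)^{d + 1 - |S|} \sum_{k=0}^{\infty} (a_S)_k \frac{t^k}{k!} = \sum_{S \subset \{0, \ldots, d\}} (-1)^{d + 1 - |S|} (1 - t)^{-a_S} = \prod_{i=0}^d \left((1 - t)^{-a_i} - 1\right). \qquad (10)$$

In particular, formula (10) shows that $P_k(a_0, \ldots, a_d) = 0$ if $k < d + 1$ and that

$$P_{d+1}(a_0, \ldots, a_d) = (d + 1)!\, a_0 \cdots a_d.$$

With this notation we have $w_T = \frac{1}{(a)_k} P_k((a_i)_{i \in T})$ (recall that $\sum_{T \subset \{0, \ldots, d\}} w_T = 1$). Another representation of the quasi Bernoulli distribution as a sum of mutually singular measures is

$$\mathcal{B}_k(a_0, \ldots, a_d) = \sum_{T \subset \{0, \ldots, d\}} w_T\, h_{k,T}\, \lambda_T. \qquad (11)$$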
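The face weights $w_T = P_k((a_i)_{i\in T})/(a)_k$ and the two stated properties of $P_k$ are easy to evaluate exactly by brute-force inclusion-exclusion. The sketch below is illustrative only; `poch`, `subsets`, `P` and `face_weight` are hypothetical names, and the enumeration over subsets is exponential in $d$, so it is meant for small examples.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def poch(c, n):
    """Pochhammer symbol (c)_n, with (c)_0 = 1."""
    out = Fraction(1)
    for i in range(n):
        out *= c + i
    return out


def subsets(indices):
    """All subsets (as tuples) of a tuple of indices, the empty one included."""
    for r in range(len(indices) + 1):
        yield from combinations(indices, r)


def P(k, a):
    """P_k(a) = sum over subsets S of (-1)^{len(a) - |S|} (a_S)_k."""
    n = len(a)
    return sum(
        (-1) ** (n - len(S)) * poch(sum((a[i] for i in S), Fraction(0)), k)
        for S in subsets(tuple(range(n)))
    )


def face_weight(k, a, T):
    """w_T: mass given by the quasi Bernoulli law of order k to the open face F_T."""
    return P(k, [a[i] for i in T]) / poch(sum(a, Fraction(0)), k)
```

Summing `face_weight(k, a, T)` over all non-empty `T` returns exactly 1, and `P(k, a)` vanishes for `k < len(a)`, in agreement with the generating function (10).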

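For simplicity denote $h_{k,T}$ by $h_{k,d}$ in the particular case $T = \{0, \ldots, d\}$. Of course it is not zero only if $k \geq d + 1$. The following proposition gives a generating function for the sequence $(h_{k,d}(x))_{k \geq d+1}$ in terms of confluent functions.

Proposition 3.2: For $a, b > 0$ denote ${}_1F_1(a; b; z) = \sum_{n=0}^{\infty} \frac{(a)_n}{(b)_n\, n!} z^n$. Then

$$\sum_{k=d+1}^{\infty} \frac{P_k(a_0, \ldots, a_d)}{k!\,(k-1)!}\, h_{k,d}(x)\, t^{k-d-1} = \frac{1}{d!} \prod_{i=0}^d a_i\ {}_1F_1(a_i + 1; 2; t x_i).$$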

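Proof: Restricting (11) to the interior of $E_{d+1}$ and comparing with (6) we get, by writing $n_i = b_i - 1$ and by using the definition of the Dirichlet distribution,

$$P_k(a_0, \ldots, a_d)\, h_{k,d}(x)\, \lambda_{\{0, \ldots, d\}}(dx) = k! \sum_{b_i > 0\ \forall i;\ b_0 + \cdots + b_d = k}\ \prod_{i=0}^d \frac{(a_i)_{b_i}}{b_i!}\ \mathcal{D}(b_0, \ldots, b_d)(dx)$$

$$= \frac{k!\,(k-1)!}{d!} \sum_{n_i \geq 0\ \forall i;\ n_0 + \cdots + n_d = k - d - 1} \left(\prod_{i=0}^d \frac{(a_i)_{n_i + 1}}{(n_i + 1)!\, n_i!}\, x_i^{n_i}\right) \lambda_{\{0, \ldots, d\}}(dx).$$

Dividing both sides by $k!\,(k-1)!$, multiplying by $t^{k-d-1}$ and summing on $k = d + 1, d + 2, \ldots$ will give the proposition, since $\frac{(a_i)_{n+1}}{(n+1)!\, n!} = a_i \frac{(a_i + 1)_n}{(2)_n\, n!}$. □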



4 Perpetuities for quasi Bernoulli 



We now compute the $T_k$ transform of a quasi Bernoulli distribution $\mathcal{B}_k(a_0, \ldots, a_d)$ and we deduce from it the desired extension of Theorem 1.1.

Theorem 4.1: Let $a_0, \ldots, a_d > 0$ with $a = a_0 + \cdots + a_d$ and let $k$ be a positive integer. Suppose that $X \sim \mathcal{D}(a_0, \ldots, a_d)$ and $B \sim \mathcal{B}_k(a_0, \ldots, a_d)$. Then

$$T_k(B)(f) = \frac{T_{a+k}(X)(f)}{T_a(X)(f)}.$$

In particular, if $X$, $B$ and $Y \sim \beta(k, a)$ are independent then

$$X \sim (1 - Y)X + YB.$$

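Proof: We have introduced the differential operator $H$ on $U_{d+1}$ in Theorem 2.1. Consider the function $F(f) = \prod_{i=0}^d f_i^{-a_i} = T_a(X)(f)$. The idea of the proof is to compute $F^{-1} H^k(F)$ in two ways. A multinomial expansion shows that

$$H^k = k! \sum_{(b_0, \ldots, b_d) \in \mathbb{N}^{d+1};\ b_0 + \cdots + b_d = k}\ \prod_{i=0}^d \frac{1}{b_i!} \left(-\frac{\partial}{\partial f_i}\right)^{b_i}.$$

We also observe that

$$\prod_{j=0}^d \left(-\frac{\partial}{\partial f_j}\right)^{b_j} F(f) = F(f) \prod_{j=0}^d \frac{(a_j)_{b_j}}{f_j^{\,b_j}}.$$

Combining these last two formulas with the definition of $\mathcal{B}_k(a_0, \ldots, a_d)$ and formula (8) we obtain that $F^{-1} H^k(F) = (a)_k\, T_k(B)$. On the other hand, by applying formula (1) to $X \sim \mathcal{D}(a_0, \ldots, a_d)$ and to $c = a$ we get $H^k(F) = (a)_k\, T_{a+k}(X)$. Comparing the two results yields the proof of $T_a(X)(f)\, T_k(B)(f) = T_{a+k}(X)(f)$. Applying (4) and part 1 of Theorem 2.1 completes the proof of the theorem. □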

Corollary 4.2: $\lim_{k \to \infty} \mathcal{B}_k(a_0, \ldots, a_d) = \mathcal{D}(a_0, \ldots, a_d)$ where the convergence is in the weak sense.

Proof: If $X \sim \mathcal{D}(a_0, \ldots, a_d)$, $Y_k \sim \beta(k, a)$ and $B_k \sim \mathcal{B}_k(a_0, \ldots, a_d)$ are independent, Theorem 4.1 shows that $(1 - Y_k)X + Y_k B_k \sim \mathcal{D}(a_0, \ldots, a_d)$. Since $E_{d+1}$ is compact there exists a subsequence $k_n \to \infty$ and a probability $\mu$ on $E_{d+1}$ such that $\mathcal{B}_{k_n}(a_0, \ldots, a_d) \to \mu$ in the weak sense. Furthermore the distribution of $1 - Y_k$ converges to the Dirac mass $\delta_0$: a quick way to see this is to consider the Mellin transform for $s > 0$:

$$\mathbb{E}((1 - Y_k)^s) = \frac{\Gamma(a + s)\, \Gamma(a + k)}{\Gamma(a)\, \Gamma(a + k + s)} \to_{k \to \infty} 0.$$

Therefore the distribution of $(1 - Y_{k_n})X + Y_{k_n} B_{k_n}$ converges weakly to $\mu$. As a consequence $\mu = \mathcal{D}(a_0, \ldots, a_d)$ and does not depend on the particular subsequence $(k_n)$. This proves the result. □

Theorem 4.1 implies that for every positive integer $k$ and for $X \sim \mathcal{D}(a_0, \ldots, a_d)$ there exists a probability distribution for $B$ such that $T_k(B) = T_{a+k}(X)/T_a(X)$. One can wonder whether this statement can be extended to positive real numbers $c$ instead of the integers $k$ or not. More specifically, does there exist a distribution $\mathcal{B}_c(a_0, \ldots, a_d)$ on $E_{d+1}$ for $B$ such that $T_c(B) = T_{a+c}(X)/T_a(X)$? We observe easily that it cannot be true in general: take $c$ a positive number, $X$ uniform on $(0, 1)$ and $Y \sim \beta(c, 2)$ with $X$ and $Y$ independent; then

• If $0 < c < 1$ it is impossible to find a distribution for a random variable $B$ independent of $X, Y$ such that $X \sim (1 - Y)X + YB$.

• If $c > 1$ and if $B \sim \frac{1}{c+1}(\delta_0 + \delta_1) + \frac{c-1}{c+1}\, 1_{(0,1)}(b)\, db$ is independent of $(X, Y)$ then

$$X \sim (1 - Y)X + YB.$$

More generally we prove the following 

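Theorem 4.3. Let $c$ be a positive number. For a non-empty set $T \subset \{0, \ldots, d\}$ we denote by $\lambda_T$ the uniform probability on the convex set generated by $\{e_i\ ;\ i \in T\}$. Introduce also the uniform probability on the union of the faces of $E_{d+1}$ of dimension $k$

$$\Lambda_k = \frac{(k + 1)!\,(d - k)!}{(d + 1)!} \sum_{T \subset \{0, \ldots, d\};\ |T| = k + 1} \lambda_T$$

and consider the signed measure on $E_{d+1}$ defined by

$$\nu_{c,d} = \frac{d!\,(d + 1)!}{(c + 1)(c + 2) \cdots (c + d)} \sum_{k=0}^d \frac{(c - 1)(c - 2) \cdots (c - k)}{k!\,(k + 1)!\,(d - k)!}\, \Lambda_k.$$

Then for all $f_i > 0$, $i = 0, \ldots, d$:

$$\int_{E_{d+1}} \frac{\nu_{c,d}(dx)}{\langle f, x\rangle^{c}} = f_0 \cdots f_d \int_{E_{d+1}} \frac{\lambda_d(dx)}{\langle f, x\rangle^{c + d + 1}}. \qquad (12)$$

In particular, if $Y \sim \beta(c, d + 1)$ and $X \sim \lambda_d$ are independent, then there exists a random variable $B$ on $E_{d+1}$ independent of $(Y, X)$ such that $X \sim (1 - Y)X + YB$ if and only if either $c$ is a non-negative integer or $c > d$. Under these circumstances $B \sim \nu_{c,d}$.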

Comments. Note that $\lambda_d = \mathcal{D}(1, \ldots, 1)$. Therefore the theorem says that the quasi Bernoulli distribution $\nu_{c,d} = \mathcal{B}_c(1, \ldots, 1)$ with continuous parameter $c$ does exist if and only if either $c$ is an integer or $c > d$. For $d = 2$ denote by $\lambda_{ij}$ the uniform distribution on the segment $(e_i, e_j)$, and by $\lambda_2$ the uniform distribution on the triangle with vertices $e_0, e_1, e_2$. Then for $c = 1$ or $c \geq 2$

$$\nu_{c,2} = \mathcal{B}_c(1, 1, 1) = \frac{1}{(c + 1)(c + 2)} \left(2(\delta_{e_0} + \delta_{e_1} + \delta_{e_2}) + 2(c - 1)(\lambda_{01} + \lambda_{02} + \lambda_{12}) + (c - 1)(c - 2)\lambda_2\right).$$
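The coefficients of $\Lambda_0, \ldots, \Lambda_d$ in $\nu_{c,d}$ can be tabulated exactly, which makes the existence criterion of Theorem 4.3 concrete. The following sketch uses hypothetical helper names and rational values of $c$; it only evaluates the formula defining $\nu_{c,d}$ and tests the sign of each coefficient.

```python
from fractions import Fraction
from math import factorial


def nu_coefficients(c, d):
    """Coefficients of Lambda_0, ..., Lambda_d in the signed measure nu_{c,d}."""
    norm = Fraction(factorial(d) * factorial(d + 1))
    for j in range(1, d + 1):
        norm /= c + j
    coeffs = []
    falling = Fraction(1)  # (c - 1)(c - 2)...(c - k); empty product when k = 0
    for k in range(d + 1):
        if k > 0:
            falling *= c - k
        coeffs.append(
            norm * falling / (factorial(k) * factorial(k + 1) * factorial(d - k))
        )
    return coeffs


def is_probability(c, d):
    """True when every coefficient is non-negative (the total mass is always one)."""
    return all(w >= 0 for w in nu_coefficients(c, d))
```

For `d = 3`, the value `c = Fraction(5, 2)` yields a negative coefficient on $\Lambda_3$, while `c = 2` and `c = Fraction(7, 2)` give genuine probabilities, matching the "integer or $c > d$" dichotomy.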

The proof of Theorem 4.3 is intricate enough to lead us to think that the existence of $\mathcal{B}_c(a_0, \ldots, a_d)$ for arbitrary positive numbers $(a_0, \ldots, a_d)$ is a delicate problem, even when all $a_i$ are equal. To illustrate this for $d = 1$ we have to study whether there exists or not a positive probability $\mu(db)$ on $[0, 1]$, depending on $a_0$, $a_1$ and $c$, such that for all $(f_0, f_1) = (1, 1 - t)$ we have

$$\int_0^1 \frac{\mu(db)}{(1 - tb)^{c}} = \frac{(1 - t)^{a_1}}{B(a_0, a_1)} \int_0^1 \frac{x^{a_1 - 1}(1 - x)^{a_0 - 1}}{(1 - tx)^{a_0 + a_1 + c}}\, dx.$$

Application of Property 5 of Theorem 2.1 shows that necessarily $\mu(db)$ has atoms at $0$ and $1$. As shown by Proposition 4.4 below the mass at $0$ is given by the limit

$$\lim_{t \to -\infty} \int_0^1 \frac{\mu(db)}{(1 - tb)^{c}} = \frac{B(a_0 + c, a_1)}{B(a_0, a_1)}.$$

A similar result holds for the mass at $1$. However finding the part of $\mu(db)$ concentrated on $(0, 1)$ is challenging. One can postulate that it has a density $f$ which therefore satisfies, in terms of the Gauss hypergeometric function ${}_2F_1$:

$$\int_0^1 \frac{f(b)\, db}{(1 - tb)^{c}} = (1 - t)^{a_1}\ {}_2F_1(a_0 + a_1 + c,\ a_1;\ a_0 + a_1;\ t) - \frac{B(a_0 + c, a_1)}{B(a_0, a_1)} - \frac{B(a_0, a_1 + c)}{B(a_0, a_1)\,(1 - t)^{c}}. \qquad (14)$$

If $a_0$ and $a_1$ are positive integers one can show that $f$ is a polynomial of degree $a_0 + a_1 - 2$ with a complicated expression.

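Proof of Theorem 4.3. Since all probabilities $\Lambda_0, \ldots, \Lambda_d$ are mutually singular, clearly $\nu_{c,d}$ is a positive measure if and only if either $c$ is an integer or $c > d$. We observe also that

$$\sum_{k=0}^d \frac{(c - 1)(c - 2) \cdots (c - k)}{k!\,(k + 1)!\,(d - k)!} = \frac{(c + 1)(c + 2) \cdots (c + d)}{d!\,(d + 1)!}.$$

(Compute the coefficient of $z^{d}$ on both sides of $(1 + z)^{c-1}(1 + z)^{d+1} = (1 + z)^{c+d}$ to see this.) This proves that the total mass of $\nu_{c,d}$ is one. Denote for simplicity $F_c(f_0, \ldots, f_d) = \int_{E_{d+1}} \frac{\lambda_d(dx)}{\langle f, x\rangle^{c + d + 1}}$. This is a symmetric function of the $f_i$'s. We now show by induction on $d$ that

$$F_c(f_0, \ldots, f_d) = \frac{d!}{(c + 1)(c + 2) \cdots (c + d)} \sum_{i=0}^d \frac{1}{f_i^{\,c+1} \prod_{j \neq i} (f_j - f_i)}. \qquad (16)$$

This is correct for $d = 1$ since

$$\int_0^1 \frac{dx_1}{(f_0(1 - x_1) + f_1 x_1)^{c+2}} = \frac{1}{(c + 1)(f_0 - f_1)} \left(\frac{1}{f_1^{\,c+1}} - \frac{1}{f_0^{\,c+1}}\right).$$

Assuming that (16) is true for $d - 1$ we write (recall that $T_d$ is the tetrahedron defined in Section 1 and that its Lebesgue measure is $1/d!$):

$$F_c(f_0, \ldots, f_d) = d! \int_{T_d} \frac{dx_1 \ldots dx_d}{(f_0(1 - x_1 - \cdots - x_d) + f_1 x_1 + \cdots + f_d x_d)^{c + d + 1}}$$

$$= d! \int_{T_{d-1}} \left(\int_0^{1 - x_1 - \cdots - x_{d-1}} \frac{dx_d}{(f_0(1 - x_1 - \cdots - x_d) + f_1 x_1 + \cdots + f_d x_d)^{c + d + 1}}\right) dx_1 \ldots dx_{d-1}$$

$$= \frac{d}{(c + d)(f_0 - f_d)} \left(F_c(f_1, f_2, \ldots, f_d) - F_c(f_0, f_1, \ldots, f_{d-1})\right).$$

The last equality is enough to extend (16) from $d - 1$ to $d$. We now apply (16) to the computation of $\int_{E_T} \frac{\lambda_T(dx)}{\langle f, x\rangle^{c}}$ when $|T| = k + 1$ by changing $(d, c)$ to $(k, c - k - 1)$ and we get

$$\int_{E_T} \frac{\lambda_T(dx)}{\langle f, x\rangle^{c}} = \frac{k!}{(c - k)(c - k + 1) \cdots (c - 1)} \sum_{i \in T} \frac{1}{f_i^{\,c-k} \prod_{j \neq i,\ j \in T} (f_j - f_i)}.$$

From this calculation, statement (12) becomes equivalent to

$$\sum_{T \subset \{0, \ldots, d\};\ T \neq \emptyset}\ \sum_{i \in T} \frac{1}{f_i^{\,c + 1 - |T|} \prod_{j \neq i,\ j \in T} (f_j - f_i)} = f_0 \cdots f_d \sum_{i=0}^d \frac{1}{f_i^{\,c+1} \prod_{j \neq i} (f_j - f_i)}. \qquad (17)$$

Observe now by inversion of summations on the left-hand side that (17) can be rewritten as

$$\sum_{i=0}^d \frac{1}{f_i^{\,c}} \sum_{T \ni i}\ \prod_{j \neq i,\ j \in T} \frac{f_i}{f_j - f_i} = \sum_{i=0}^d \frac{1}{f_i^{\,c}} \prod_{j \neq i} \frac{f_j}{f_j - f_i}.$$

Now we easily prove that for all $i = 0, \ldots, d$

$$\sum_{T \ni i}\ \prod_{j \neq i,\ j \in T} \frac{f_i}{f_j - f_i} = \prod_{j \neq i} \frac{f_j}{f_j - f_i}. \qquad (18)$$

To see this, it is enough to prove it for $i = 0$. Denoting $x_j = f_0/(f_j - f_0)$, equality (18) for $i = 0$ becomes $\sum_{T \subset \{1, \ldots, d\}} \prod_{j \in T} x_j = \prod_{j=1}^d (1 + x_j)$, which is obviously true and proves (12). The remainder of the theorem is plain. □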
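The algebraic identity (18), on which the proof rests, can be verified exactly for any choice of distinct positive $f_i$. The sketch below is only an illustration with hypothetical function names; it enumerates the subsets $T \ni i$ through the subsets of the remaining indices.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def lhs_18(f, i):
    """Sum over subsets T containing i of prod_{j in T, j != i} f_i / (f_j - f_i)."""
    others = [j for j in range(len(f)) if j != i]
    total = Fraction(0)
    for r in range(len(others) + 1):
        for rest in combinations(others, r):  # rest = T without the index i
            total += prod((f[i] / (f[j] - f[i]) for j in rest), start=Fraction(1))
    return total


def rhs_18(f, i):
    """prod_{j != i} f_j / (f_j - f_i)."""
    return prod(
        (f[j] / (f[j] - f[i]) for j in range(len(f)) if j != i),
        start=Fraction(1),
    )
```

The entries of `f` must be pairwise distinct, otherwise a denominator vanishes; this mirrors the fact that formula (16) is written for distinct $f_i$ and extends to the general case by continuity.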

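The next proposition is a remark about the weights of a face and a vertex for $\mathcal{B}_c(a_0, \ldots, a_d)$ when this distribution exists:

Proposition 4.4: If $B \sim \mathcal{B}_c(a_0, \ldots, a_d)$ denote $a = a_0 + \cdots + a_d$ and $a' = a_{k+1} + \cdots + a_d$. Then

$$\Pr(B_0 = \ldots = B_k = 0) = \frac{\Gamma(a)\, \Gamma(a' + c)}{\Gamma(a + c)\, \Gamma(a')}, \qquad \Pr(B = e_i) = \frac{\Gamma(a)\, \Gamma(a_i + c)}{\Gamma(a + c)\, \Gamma(a_i)}.$$

Proof: By definition $T_c(B) = T_{a+c}(X)/T_a(X)$ where $X \sim \mathcal{D}(a_0, \ldots, a_d)$. We want to use Property 5 of Theorem 2.1 and we consider

$$T_c(B)(f_0, \ldots, f_0, 1, \ldots, 1) = \frac{f_0^{\,a_0 + \cdots + a_k}}{B(a_0, \ldots, a_d)} \int_{T_d} \frac{x_0^{a_0 - 1} \cdots x_{d-1}^{a_{d-1} - 1} (1 - x_0 - \cdots - x_{d-1})^{a_d - 1}}{((f_0 - 1)(x_0 + \cdots + x_k) + 1)^{a + c}}\, dx_0 \ldots dx_{d-1} \to_{f_0 \to \infty} \frac{EF}{B(a_0, \ldots, a_d)} \qquad (19)$$

where

$$E = \int_{(0, \infty)^{k+1}} \frac{u_0^{a_0 - 1} \cdots u_k^{a_k - 1}}{(1 + u_0 + \cdots + u_k)^{a + c}}\, du_0 \ldots du_k = \frac{\Gamma(a_0) \cdots \Gamma(a_k)\, \Gamma(a' + c)}{\Gamma(a + c)} \qquad (20)$$

and

$$F = \int x_{k+1}^{a_{k+1} - 1} \cdots x_{d-1}^{a_{d-1} - 1} (1 - x_{k+1} - \cdots - x_{d-1})^{a_d - 1}\, dx_{k+1} \ldots dx_{d-1} = B(a_{k+1}, \ldots, a_d).$$

The limit (19) is obtained by making the change of variable $u_i = f_0 x_i$ for $i = 0, \ldots, k$ and letting $f_0 \to \infty$. The second equality of (20) can be easily proved by applying to $A = 1 + u_0 + \cdots + u_k$ the equality

$$\frac{1}{A^{a + c}} = \int_0^{\infty} e^{-sA}\, \frac{s^{a + c - 1}}{\Gamma(a + c)}\, ds.$$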



5 Quasi Bernoulli and Dirichlet processes. 

Recall that if $(\Omega, \alpha)$ is a measure space such that $\alpha(\Omega) = a$ is finite, the Dirichlet process with parameter $\alpha$ is a random probability $P \sim \mathcal{D}(\alpha)$ on $\Omega$ such that for any partition $(A_0, \ldots, A_d)$ of $\Omega$

$$(P(A_0), \ldots, P(A_d)) \sim \mathcal{D}(\alpha(A_0), \ldots, \alpha(A_d)).$$

While the term 'process' is questionable since no idea of time is involved in this concept, it is now well ingrained in the literature; the reason is that when $\Omega$ is the interval $[0, T]$ and $\alpha$ is the Lebesgue measure, the distribution function $P([0, t])$ for $0 \leq t \leq T$ is $Y(t)/Y(T)$ where $Y$ is the standard Gamma Levy process. A striking property of $P \sim \mathcal{D}(\alpha)$ is that it is almost surely purely atomic: if $(V_j)_{j=1}^{\infty}$ are iid random variables on $\Omega$ with distribution $Q = \alpha/a$ there exist random weights $(W_j)_{j=1}^{\infty}$ (that is, $W_j \geq 0$ and $\sum_{j=1}^{\infty} W_j = 1$) such that $\sum_{j=1}^{\infty} W_j \delta_{V_j} \sim \mathcal{D}(\alpha)$. The exact description of the distribution of $(W_j)_{j \geq 1}$ can be found in the fundamental paper by Ferguson [5]. A large amount of literature has followed [5]. The long paper by James, Lijoi and Prunster [7] contains a wealth of information on the Dirichlet process $P \sim \mathcal{D}(\alpha)$ and on the distributions of the functionals $\int f(w) P(dw)$ with numerous references. The present section describes the analogous random probability $P \sim \mathcal{B}_k(\alpha)$ which is such that for any partition $(A_0, \ldots, A_d)$ of $\Omega$ we have

$$(P(A_0), \ldots, P(A_d)) \sim \mathcal{B}_k(\alpha(A_0), \ldots, \alpha(A_d)). \qquad (21)$$

The object $\mathcal{B}_k(\alpha)$ is natural since for $k = 1$ the random probability $P = \delta_V$ where $V \sim \alpha/a$ satisfies (21). Not surprisingly, we will see that random probabilities on $\Omega$ satisfying (21) are concentrated on at most $k$ points $V_1, \ldots, V_k$ where $V_i \sim \alpha/a$, although they will not be independent as they are in the limiting case of the Dirichlet process. Needless to say, the distribution of the random weights on these atoms will not be simpler than in the limiting case.

Before stating the theorem for general $k$ we sketch the construction of $\mathcal{B}_2(\alpha)$. We first select $V_1 \sim \alpha/a$. Having done this, with probability $1/(a + 1)$ we take $V_2 = V_1$ and with probability $a/(a + 1)$ we choose $V_2$ independently from $V_1$ with distribution $\alpha/a$. Finally we take $W_1$ uniform on $(0, 1)$ and $W_2 = 1 - W_1$ and we consider the random probability

$$P = W_1 \delta_{V_1} + W_2 \delta_{V_2}.$$

To see that (21) is fulfilled for $k = 2$ we denote $a_i = \alpha(A_i)$ for simplicity; we observe that the probability that $(P(A_0), \ldots, P(A_d))$ is equal to $e_i$, where $i = 0, \ldots, d$, is exactly

$$\frac{1}{a + 1}\, \frac{a_i}{a} + \frac{a}{a + 1}\, \frac{a_i^2}{a^2} = \frac{(a_i)_2}{(a)_2}.$$

On the other hand for $i \neq j$ we have $\Pr(V_1 \in A_i, V_2 \in A_j) = \frac{a_i a_j}{(a)_2}$, and the conditional distribution of $(P(A_0), \ldots, P(A_d))$ knowing that $V_1 \in A_i$, $V_2 \in A_j$ is that of $W_1 e_i + W_2 e_j$. These two facts show that (21) is correct for $k = 2$.

Theorem 5.1. Let $(\Omega, \alpha)$ be a measure space such that $\alpha(\Omega) = a$ is finite and let $k$ be a positive integer. We select the random variables $(M, X, W)$ as follows:

1. $M = (M_1, \ldots, M_k) \in \mathbb{N}^k$ are such that $M_1 + 2M_2 + \cdots + kM_k = k$, with the Ewens distribution with parameters $k$ and $a$:

$$\Pr((M_1, \ldots, M_k) = (m_1, \ldots, m_k)) = C(m)\, \frac{a^{m_1 + \cdots + m_k}}{(a)_k}$$

where

$$C(m) = C(m_1, \ldots, m_k) = \frac{k!}{\prod_{j=1}^k j^{m_j}\, m_j!}.$$

We denote $S_j = M_1 + \cdots + M_j$ and $B(M) = (b_t)_{t=1}^{S_k}$ where $b_t = j$ for $S_{j-1} < t \leq S_j$.

2. We select i.i.d. random variables $X = (X_t)_{t=1}^{S_k}$ such that $X_t \sim \alpha/a$.

3. We select $W = (W_t)_{t=1}^{S_k} \sim \mathcal{D}(B(M))$.

4. When conditioned on $M$ the random variables $X$ and $W$ are independent.

Then $P = \sum_{t=1}^{S_k} W_t \delta_{X_t}$ satisfies (21).
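The Ewens probabilities of item 1 and the sequence $B(M)$ can be tabulated exactly for small $k$. The sketch below is illustrative only: the names `cycle_types`, `ewens_pmf` and `b_sequence` are hypothetical, and a rational value of $a$ is assumed so that the probabilities can be summed without rounding.

```python
from fractions import Fraction
from math import factorial


def poch(c, n):
    """Pochhammer symbol (c)_n, with (c)_0 = 1."""
    out = Fraction(1)
    for i in range(n):
        out *= c + i
    return out


def cycle_types(k):
    """Yield (m_1, ..., m_k) in N^k with m_1 + 2 m_2 + ... + k m_k = k."""
    def rec(j, remaining):
        if j > k:
            if remaining == 0:
                yield ()
            return
        for mj in range(remaining // j + 1):
            for rest in rec(j + 1, remaining - j * mj):
                yield (mj,) + rest
    yield from rec(1, k)


def ewens_pmf(m, a):
    """Pr(M = m) = C(m) a^{m_1 + ... + m_k} / (a)_k with C(m) = k! / prod_j j^{m_j} m_j!."""
    k = sum(j * mj for j, mj in enumerate(m, start=1))
    c_m = Fraction(factorial(k))
    for j, mj in enumerate(m, start=1):
        c_m /= j ** mj * factorial(mj)
    return c_m * a ** sum(m) / poch(a, k)


def b_sequence(m):
    """B(m): the value j repeated m_j times, for j = 1, 2, ..."""
    return [j for j, mj in enumerate(m, start=1) for _ in range(mj)]
```

Summing `ewens_pmf(m, a)` over `cycle_types(k)` gives exactly 1, and the entries of `b_sequence(m)` always add up to $k$, so $\mathcal{D}(B(M))$ is a Dirichlet law whose parameters sum to $k$.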

Comments. 



1. We say that $m = (m_1, \ldots, m_k)$ is the portrait of a permutation $\pi$ of $\{1, \ldots, k\}$ if $\pi$ has $m_j$ cycles of order $j$ for $j = 1, 2, \ldots, k$. Therefore $C(m)$ is the number of permutations with portrait $m$. For the history and the properties of the Ewens distribution, Johnson, Kotz and Balakrishnan [8] Chapter 41 is a good reference. Note that $m = (m_1, \ldots, m_k)$ can be seen as the coding of a partition of the integer $k$. For instance if $k = 13$ and if the partition is represented by the Ferrers diagram

o o o o o 

o o 

o o 

o o 

o 

o 

corresponding to the partition $1 + 1 + 2 + 2 + 2 + 5 = 13$ then $m = (2, 3, 0, 0, 1, \ldots)$ where the dots mean a sequence of 8 zeros. The sum $\sum_{j=1}^k m_j$, which is the height of the Ferrers diagram, is equal to 6 in the example. Finally the sequence $(b_t)_{t=1}^{S_k}$ mentioned in part 1) of the theorem is also another coding of the partition and describes the lengths of the rows of the Ferrers diagram from below. In the above example, $(b_1, b_2, b_3, b_4, b_5, b_6) = (1, 1, 2, 2, 2, 5)$.

2. Let us consider the theorem for $k = 3$. In this case

$$\Pr(M = (3, 0, 0)) = \frac{a^3}{(a)_3}, \quad \Pr(M = (1, 1, 0)) = \frac{3a^2}{(a)_3}, \quad \Pr(M = (0, 0, 1)) = \frac{2a}{(a)_3},$$

$$B(3, 0, 0) = (1, 1, 1), \quad B(1, 1, 0) = (1, 2), \quad B(0, 0, 1) = (3).$$

Therefore,

• $P = W_1 \delta_{X_1} + W_2 \delta_{X_2} + W_3 \delta_{X_3}$ with $(W_1, W_2, W_3) \sim \mathcal{D}(1, 1, 1)$ under conditioning on $M = (3, 0, 0)$;

• $P = W_1 \delta_{X_1} + W_2 \delta_{X_2}$ with $(W_1, W_2) \sim \mathcal{D}(1, 2)$ under conditioning on $M = (1, 1, 0)$;

• $P = \delta_{X_1}$ under conditioning on $M = (0, 0, 1)$.

3. To illustrate the notation of Theorem 5.1, let us come back once more to the case $k = 2$. Above, we took informally first $V_1 \sim \alpha/a$, then we took $V_2$ with the mixed distribution $\frac{1}{a+1}\delta_{V_1} + \frac{a}{a+1}\, \frac{\alpha}{a}$ and finally we took $P = W_1 \delta_{V_1} + W_2 \delta_{V_2}$. With the new notation, $M$ takes two values:

• $(M_1, M_2) = (0, 1)$ with probability $1/(a + 1)$; in this case $X_1 = V_1 = V_2$, $B(0, 1) = (2)$ and $P = \delta_{X_1}$;

• $(M_1, M_2) = (2, 0)$ with probability $a/(a + 1)$; in this case $B(2, 0) = (1, 1)$, the random probability $P$ has in general two atoms $X_1$ and $X_2$ (at least when $\alpha$ has no atoms) and they are mixed by $(W_1, W_2) = (W_1, 1 - W_1) \sim \mathcal{D}(1, 1)$, that is to say with $W_1$ uniform on $(0, 1)$.



4. If we consider the particular case where $\Omega = (0, 1)$ and where $\alpha$ is the Lebesgue measure (therefore $a = 1$) the random probability $P \sim \mathcal{B}_k(\alpha)$ on $(0, 1)$ will be built according to Theorem 5.1 as follows: for creating $M$ we take a permutation $\pi$ of $\{1, \ldots, k\}$ with uniform distribution; we consider $M = (M_1, \ldots, M_k)$ which are the numbers of cycles of $\pi$ of size $1, \ldots, k$, respectively; the sequence $M$ induces a partition $B(M)$ of the integer $k$; we take independent random variables $X_1, \ldots, X_{M_1 + \cdots + M_k}$ uniformly distributed on $(0, 1)$; they will be the points of discontinuity of the random distribution function $F(t) = P([0, t])$; we take finally a Dirichlet random variable $W = (W_t)_{t=1}^{M_1 + \cdots + M_k} \sim \mathcal{D}(B(M))$ and $W_t$ is the amplitude of the jump of the random process $F$ at $X_t$.

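5. We observe that the idea of the $T_c$ transform extends well to the context of random probabilities on $\Omega$. If $f$ is a positive measurable function on $\Omega$, if $c > 0$ and if $P$ is a random probability on $\Omega$ we define

$$T_c(P)(f) = \mathbb{E}\left(\Big(\int_{\Omega} f(w) P(dw)\Big)^{-c}\right) \leq \infty$$

which is finite in particular if there exists $m > 0$ such that $f(w) \geq m$ for all $w \in \Omega$. If $P = X \sim \mathcal{D}(\alpha)$ is a Dirichlet process such that $a = \alpha(\Omega)$, formula (2) or [4] page 35 shows that

$$T_a(X)(f) = \mathbb{E}\left(\Big(\int_{\Omega} f(w) X(dw)\Big)^{-a}\right) = \exp\left(-\int_{\Omega} \log f(w)\, \alpha(dw)\right).$$

An interesting application of Proposition 3.1 gives the following when $P = B \sim \mathcal{B}_k(\alpha)$ is the quasi Bernoulli process of Theorem 5.1. Denoting $\sigma_j(f) = \int_{\Omega} \frac{\alpha(dw)}{f(w)^j}$ we have the elegant result

$$T_k(B)(f) = \mathbb{E}\left(\Big(\int_{\Omega} f(w) B(dw)\Big)^{-k}\right) = \frac{k!}{(a)_k} \sum_{(m_1, \ldots, m_k) \in \mathbb{N}^k;\ m_1 + 2m_2 + \cdots + k m_k = k}\ \prod_{j=1}^k \frac{\sigma_j(f)^{m_j}}{j^{m_j}\, m_j!}.$$

For instance for $k = 2$

$$T_2(B)(f) = \frac{1}{a(a + 1)} \left(\int_{\Omega} \frac{\alpha(dw)}{f(w)}\right)^2 + \frac{1}{a(a + 1)} \int_{\Omega} \frac{\alpha(dw)}{f(w)^2}.$$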

6. Needless to say, the formula which is the backbone of the paper, namely

$$T_k(B)(f) = \frac{T_{a+k}(X)(f)}{T_a(X)(f)}$$

still holds for a Dirichlet process $X \sim \mathcal{D}(\alpha)$ and a quasi Bernoulli process $B \sim \mathcal{B}_k(\alpha)$. As a consequence, $X \sim (1 - Y)X + YB$ holds when $Y \sim \beta(k, a)$ is independent of $B$.
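The construction of Theorem 5.1 in the setting of comment 4 ($\Omega = (0,1)$, $\alpha$ the Lebesgue measure, $a = 1$) can be simulated directly. The sketch below is a minimal illustration under stated assumptions: all function names, the sample size and the seed are arbitrary choices; $M$ is drawn from the Ewens weights by plain enumeration, and $W \sim \mathcal{D}(B(M))$ is drawn as normalised gamma variables. As a sanity check it estimates the probability that $P$ puts all its mass on $(0, 1/2)$, which by (21) and the face weights of Section 3 should be $(1/2)_k/(1)_k$.

```python
import random
from math import factorial


def cycle_types(k):
    """Yield (m_1, ..., m_k) in N^k with m_1 + 2 m_2 + ... + k m_k = k."""
    def rec(j, remaining):
        if j > k:
            if remaining == 0:
                yield ()
            return
        for mj in range(remaining // j + 1):
            for rest in rec(j + 1, remaining - j * mj):
                yield (mj,) + rest
    yield from rec(1, k)


def ewens_weights(k, a):
    """Cycle types with unnormalised Ewens weights a^{sum m_j} / prod_j j^{m_j} m_j!."""
    types = list(cycle_types(k))
    weights = []
    for m in types:
        w = a ** sum(m)
        for j, mj in enumerate(m, start=1):
            w /= j ** mj * factorial(mj)
        weights.append(w)
    return types, weights


def sample_process(rng, types, weights):
    """One draw of P = sum_t W_t delta_{X_t} on (0, 1): returns (atoms, masses)."""
    m = rng.choices(types, weights=weights)[0]
    b = [j for j, mj in enumerate(m, start=1) for _ in range(mj)]  # B(M)
    atoms = [rng.random() for _ in b]                 # X_t uniform on (0, 1)
    g = [rng.gammavariate(bt, 1.0) for bt in b]       # W ~ D(B(M)) via gammas
    s = sum(g)
    return atoms, [gt / s for gt in g]


def prob_all_mass_left(k, n_samples=20_000, seed=0):
    """Estimate Pr(P((0, 1/2)) = 1), i.e. that every atom falls below 1/2."""
    rng = random.Random(seed)
    types, weights = ewens_weights(k, 1.0)
    hits = 0
    for _ in range(n_samples):
        atoms, _masses = sample_process(rng, types, weights)
        hits += all(x < 0.5 for x in atoms)
    return hits / n_samples
```

For $k = 3$ the target value is $(1/2)(3/2)(5/2)/3! = 5/16$; the estimate fluctuates around it with the usual Monte Carlo error.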
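Proof of Theorem 5.1. To show (21) we denote $a_i = \alpha(A_i)$. We compute first the distribution of $Z = (P(A_0), \ldots, P(A_d))$ by conditioning with respect to $M, X$. Denote by $N_{i,j}$ the number of $X_t$ such that $S_{j-1} < t \leq S_j$ and $X_t \in A_i$. Note that $\sum_{i=0}^d N_{i,j} = M_j$. The distribution of the vector $N_j = (N_{i,j})_{i=0}^d$ of $\mathbb{N}^{d+1}$, conditionally on $M$, is a multinomial distribution

$$\Pr(N_j = (n_{0,j}, \ldots, n_{d,j})) = \frac{M_j!}{n_{0,j}! \cdots n_{d,j}!} \prod_{i=0}^d \left(\frac{a_i}{a}\right)^{n_{i,j}}$$

where $n_{0,j} + \cdots + n_{d,j} = M_j$. Furthermore, $N_1, \ldots, N_k$ are independent. Now we introduce the following quantities $B_i = \sum_{j=1}^k j N_{i,j}$. They satisfy $\sum_{i=0}^d B_i = \sum_{i=0}^d \sum_{j=1}^k j N_{i,j} = \sum_{j=1}^k j M_j = k$. We now observe that conditionally on $M$ and $X$ we have $Z \sim \mathcal{D}(B_0, \ldots, B_d)$. To see this we use the definition of $Z$ for writing

$$Z = \left(\sum_{j=1}^k \sum_{S_{j-1} < t \leq S_j} W_t 1_{X_t \in A_0},\ \ldots,\ \sum_{j=1}^k \sum_{S_{j-1} < t \leq S_j} W_t 1_{X_t \in A_d}\right).$$

A property of the Dirichlet distribution is that if $b_i = \sum_{j=1}^{k_i} a_{ij}$ with $a_{ij} \geq 0$ and $i = 0, \ldots, d$, if

$$(X_{ij})_{0 \leq i \leq d,\ 1 \leq j \leq k_i} \sim \mathcal{D}\left((a_{ij})_{0 \leq i \leq d,\ 1 \leq j \leq k_i}\right)$$

and if $Y_i = \sum_{j=1}^{k_i} X_{ij}$ then $(Y_0, \ldots, Y_d) \sim \mathcal{D}(b_0, \ldots, b_d)$. A quick way to see this is to use (2). We apply this principle to $(X_{ij}) = (W_t)$, with parameter $a_t = j$ when $S_{j-1} < t \leq S_j$, and to $Y_i = P(A_i)$. We obtain:

$$Z \sim \mathcal{D}\left(\sum_{j=1}^k j N_{0,j},\ \ldots,\ \sum_{j=1}^k j N_{d,j}\right) = \mathcal{D}(B_0, \ldots, B_d).$$

The last step of the proof removes the conditioning on $X$ and $M$. Here in (25) below we use the notation $\sigma_j$ introduced in (7):

$$\mathbb{E}\left(\frac{1}{\langle f, Z\rangle^{k}}\right) = \mathbb{E}\left(\mathbb{E}\left(\frac{1}{\langle f, Z\rangle^{k}}\ \Big|\ M, X\right)\right) = \mathbb{E}\left(\mathbb{E}\left(\prod_{i=0}^d \frac{1}{f_i^{\,B_i}}\ \Big|\ M, X\right)\right) \qquad (22)$$

$$= \mathbb{E}\left(\mathbb{E}\left(\prod_{i=0}^d \prod_{j=1}^k \frac{1}{f_i^{\,j N_{i,j}}}\ \Big|\ M, X\right)\right) \qquad (23)$$

$$= \mathbb{E}\left(\prod_{j=1}^k \mathbb{E}\left(\prod_{i=0}^d \frac{1}{f_i^{\,j N_{i,j}}}\ \Big|\ M\right)\right) \qquad (24)$$

$$= \mathbb{E}\left(\prod_{j=1}^k \left(\frac{\sigma_j(f)}{a}\right)^{M_j}\right) = \frac{k!}{(a)_k} \sum_{(m_1, \ldots, m_k) \in \mathbb{N}^k;\ m_1 + 2m_2 + \cdots + k m_k = k}\ \prod_{j=1}^k \frac{\sigma_j(f)^{m_j}}{j^{m_j}\, m_j!}. \qquad (25)$$

Line (22) comes from $Z \sim \mathcal{D}(B_0, \ldots, B_d)$ when conditioned on $(M, X)$ and from (2), line (23) comes from the definition of $B_0, \ldots, B_d$, line (24) comes from the independence of the $N_j$'s, line (25) comes from the generating function of a multinomial distribution. The formula (9) proves that $Z \sim \mathcal{B}_k(a_0, \ldots, a_d)$. □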

6 A Markov chain on the tetrahedron 

This section gives an application of Theorems 1.1 and 4.1, does not contain new results and serves as a conclusion. Consider the homogeneous Markov chain $(X(n))_{n \geq 0}$ valued in the convex hull $E_{d+1}$ of $(e_0, \ldots, e_d)$ with the following transition process: given $X(n) \in E_{d+1}$ choose randomly a point $B(n + 1) \in E_{d+1}$ such that $B(n + 1) \sim \mathcal{B}_k(a_0, \ldots, a_d)$ and independently a random number $Y_{n+1} \sim \beta(k, a)$. Now draw the segment $(X(n), B(n + 1))$ and take the point

$$X(n + 1) = X(n)(1 - Y_{n+1}) + B(n + 1)\, Y_{n+1}$$

on this segment. Theorem 4.1 says that the Dirichlet distribution $\mathcal{D}(a_0, \ldots, a_d)$ is a stationary distribution for the Markov chain $(X(n))_{n \geq 0}$. Recall the following principle (see [3] Proposition 1):

Theorem 6.1: If $E$ is a metric space and if $C$ is the set of continuous maps $f : E \to E$ let us fix a probability $\nu(df)$ on $C$. Let $F_1, F_2, \ldots$ be a sequence of independent random variables on $C$ with the same distribution $\nu$. Define $W_n = F_n \circ \cdots \circ F_2 \circ F_1$ and $Z_n = F_1 \circ \cdots \circ F_{n-1} \circ F_n$. Suppose that almost surely $Z = \lim_n Z_n(x)$ exists in $E$ and does not depend on $x \in E$. Then

1. The distribution $\mu$ of $Z$ is a stationary distribution of the Markov chain $(W_n(x))_{n \geq 0}$ on $E$;

2. if $X$ and $F_1$ are independent and if $X \sim F_1(X)$ then $X \sim \mu$.

Choose $E = E_{d+1}$. Apply Theorem 6.1 to the distribution $\nu$ of the random map $F_1$ on $E_{d+1}$ defined by $F_1(x) = (1 - Y_1)x + Y_1 B(1)$ where $Y_1 \sim \beta(k, a)$ and $B(1) \sim \mathcal{B}_k(a_0, \ldots, a_d)$ are independent. If the $F_n$ defined by $F_n(x) = (1 - Y_n)x + Y_n B(n)$ are independent with distribution $\nu$, clearly

$$Z_n(x) = \left(\prod_{j=1}^n (1 - Y_j)\right) x + \sum_{k=1}^n \left(\prod_{j=1}^{k-1} (1 - Y_j)\right) Y_k B(k)$$

converges almost surely to the sum of the following converging series

$$Z = \sum_{k=1}^{\infty} \left(\prod_{j=1}^{k-1} (1 - Y_j)\right) Y_k B(k) \qquad (26)$$

and therefore the hypotheses of Theorem 6.1 are met. As a consequence the Dirichlet law $\mathcal{D}(a_0, \ldots, a_d)$ is the unique stationary distribution of the Markov chain $(X(n))_{n \geq 0}$ and is the distribution of $Z$ defined by (26). Finally recall the definition of a perpetuity [6] on an affine space $A$. Let $\nu(df)$ be a probability on the space of affine transformations $f$ mapping $A$ into itself. We say that the probability $\mu$ on $A$ is a perpetuity generated by $\nu$ if $X \sim F(X)$ when $F \sim \nu$ and $X \sim \mu$ are independent. If the conditions of Theorem 6.1 are met for $\nu$, there is exactly one perpetuity generated by $\nu$. Theorems 1.1, 4.1 and 4.3 say that the Dirichlet distribution is a perpetuity for the random affine map $F(x) = (1 - Y)x + YB$ on the affine hyperplane $A$ of $\mathbb{R}^{d+1}$ containing $e_0, \ldots, e_d$, generated by various distributions $\nu$ of $(1 - Y, YB)$. Theorem 6.1 shows that a Dirichlet process is also a perpetuity generated by the distribution of $(1 - Y, YB)$ where the set of probabilities on $\Omega$ replaces the tetrahedra with $d + 1$ vertices and where $Y \sim \beta(k, a)$ is independent of the quasi Bernoulli process $\mathcal{B}_k(\alpha)$.
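The Markov chain of this section is straightforward to simulate in the simplest case $k = 1$, where $B(n+1)$ is the Bernoulli variable of Theorem 1.1. The sketch below is an illustration under stated assumptions: the function name, chain length, burn-in and seed are arbitrary, and the check relies on the standard Dirichlet moments $\mathbb{E}(X_i) = a_i/a$ and $\mathbb{E}(X_i^2) = a_i(a_i + 1)/(a(a + 1))$.

```python
import random


def simulate_chain(a, n_steps=100_000, burn_in=1_000, seed=0):
    """Run X(n+1) = X(n)(1 - Y) + B Y with Y ~ beta(1, a) and Pr(B = e_i) = a_i / a.

    Returns the time averages of X_i and X_i^2 after the burn-in period.
    """
    rng = random.Random(seed)
    total = sum(a)
    size = len(a)
    x = [1.0 / size] * size          # arbitrary starting point in the tetrahedron
    first = [0.0] * size
    second = [0.0] * size
    for n in range(burn_in + n_steps):
        y = rng.betavariate(1.0, total)
        i = rng.choices(range(size), weights=a)[0]
        x = [(1.0 - y) * xj for xj in x]   # move towards the vertex e_i ...
        x[i] += y                          # ... by the fraction y of the segment
        if n >= burn_in:
            for j, xj in enumerate(x):
                first[j] += xj
                second[j] += xj * xj
    return [s / n_steps for s in first], [s / n_steps for s in second]
```

With `a = [0.5, 1.0, 1.5]` the empirical first and second moments should settle near those of $\mathcal{D}(0.5, 1, 1.5)$, consistent with the Dirichlet law being the stationary distribution; matching two moments is of course only a partial check.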